Optimistic Reinforcement Learning-Based Skill Insertions for Task and Motion Planning
Liu, Gaoyuan, de Winter, Joris, Durodie, Yuri, Steckelmacher, Denis, Nowe, Ann, Vanderborght, Bram
Abstract--Task and motion planning (TAMP) for robotic manipulation necessitates long-horizon reasoning involving versatile actions and skills. While deterministic actions can be crafted by sampling or optimizing under certain constraints, planning actions with uncertainty, i.e., probabilistic actions, remains a challenge for TAMP. In contrast, Reinforcement Learning (RL) excels at acquiring versatile, yet short-horizon, manipulation skills that are robust to uncertainty. We therefore insert RL skills into TAMP: besides the policy, an RL skill is defined with data-driven logical components that enable it to be deployed by symbolic planning. A plan refinement sub-routine is designed to further tackle the inevitable effect uncertainties. In the experiments, we compare our method with baseline hierarchical planning approaches from both the TAMP and RL fields and illustrate the strengths of the method. The results show that by embedding RL skills, we extend the capability of TAMP to domains with probabilistic skills and improve planning efficiency compared to previous methods.

Reinforcement Learning (RL) empowers robots to acquire manipulation skills without human programming. However, prior works mostly tackle single-skill or short-horizon manipulation tasks, such as grasping [1], peg insertion [2], or synergies between two actions [3]. Long-horizon manipulation planning remains a challenge in the RL field because of expanding state/action spaces, sparse rewards, etc. [4].
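The skill-insertion idea described in the abstract can be illustrated with a minimal toy sketch. All names here (`Skill`, `plan`, `execute_with_refinement`) are illustrative assumptions, not the paper's implementation: an RL skill is wrapped with symbolic preconditions and effects plus a learned success probability, and a refinement loop replans whenever an executed skill's expected effect fails to hold.

```python
import random

class Skill:
    """An RL skill wrapped with data-driven logical components (illustrative)."""
    def __init__(self, name, preconds, effects, success_prob):
        self.name = name
        self.preconds = preconds          # symbols required before execution
        self.effects = effects            # symbols expected after execution
        self.success_prob = success_prob  # learned effect reliability

    def applicable(self, state):
        return self.preconds <= state

    def execute(self, state, rng):
        # Effect uncertainty: the symbolic effect holds only with some probability.
        if rng.random() < self.success_prob:
            return state | self.effects
        return state

def plan(state, goal, skills):
    """Greedy forward search over symbolic states (a stand-in for a real planner)."""
    sequence, current = [], set(state)
    while not goal <= current:
        step = next((s for s in skills
                     if s.applicable(current) and s.effects - current), None)
        if step is None:
            return None  # no applicable skill makes progress
        sequence.append(step)
        current |= step.effects
    return sequence

def execute_with_refinement(state, goal, skills, rng, max_replans=10):
    """Replan whenever an executed skill's expected effect fails to appear."""
    current = set(state)
    for _ in range(max_replans):
        seq = plan(current, goal, skills)
        if seq is None:
            return None
        for skill in seq:
            current = skill.execute(current, rng)
            if not skill.effects <= current:
                break  # effect failed: refine by replanning from the new state
        if goal <= current:
            return current
    return None
```

With reliable skills the first plan succeeds; with `success_prob < 1` the outer loop absorbs effect failures by replanning, which mirrors the role of the plan refinement sub-routine.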
- Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
- North America > United States (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China (0.04)
Sub-goal Distillation: A Method to Improve Small Language Agents
Hashemzadeh, Maryam, Stengel-Eskin, Elias, Chandar, Sarath, Cote, Marc-Alexandre
While Large Language Models (LLMs) have demonstrated significant promise as agents in interactive tasks, their substantial computational requirements and restricted number of calls constrain their practical utility, especially in long-horizon interactive tasks such as decision-making or in scenarios involving continuous ongoing tasks. To address these constraints, we propose a method for transferring the performance of an LLM with billions of parameters to a much smaller language model (770M parameters). Our approach involves constructing a hierarchical agent comprising a planning module, which learns through Knowledge Distillation from an LLM to generate sub-goals, and an execution module, which learns to accomplish these sub-goals using elementary actions. Subsequently, we utilize this annotated data to fine-tune both the planning and execution modules. Importantly, neither module relies on real-time access to an LLM during inference, significantly reducing the overall cost associated with LLM interactions to a fixed cost. In ScienceWorld, a challenging and multi-task interactive text environment, our method surpasses standard imitation learning based solely on elementary actions by 16.7% (absolute). Our analysis highlights the efficiency of our approach compared to other LLM-based methods.

Recently, Large Language Models (LLMs) have found applications in various fields, including multi-task learning, decision-making, answering questions, summarizing documents, translating languages, completing sentences, and serving as search assistants. The promising advantage of LLMs is attributed to their training on extensive text datasets, resulting in impressive capabilities. This prior knowledge can be leveraged for action planning to solve tasks in robotics and reinforcement learning (Huang et al., 2022b; Brohan et al., 2023; Liang et al., 2023). However, the extreme size of LLMs makes them computationally unaffordable for many applications.
Consequently, there is an increasing demand to find approaches that are less computationally intensive while still capitalizing on the knowledge embedded in LLMs. One prevalent technique is the use of Knowledge Distillation (KD) (Buciluǎ et al., 2006; Hinton et al., 2015), wherein a smaller model is trained with guidance from a larger model. Through this approach, we can leverage the knowledge in an LLM to train a more compact model with a reduced number of parameters. We employ Knowledge Distillation from an LLM to train the planning module.

Figure 1: Example of annotating an expert trajectory with sub-goals for a particular variation of task 1-4 (change-the-state-of-matter-of).
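The distillation objective mentioned above can be sketched in its classic form (Hinton et al., 2015): the student is trained to match the teacher's temperature-softened output distribution via KL divergence. This is a generic, minimal sketch, not the paper's exact sequence-level setup, and the logits below are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)  # teacher (large LLM) targets
    q = softmax(student_logits, temperature)  # student (small model) predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

teacher = [3.0, 1.0, 0.2]  # hypothetical teacher logits for one token
student = [2.5, 1.2, 0.4]  # hypothetical student logits for the same token
loss = distillation_loss(student, teacher)
```

A higher temperature spreads probability mass over more tokens, exposing the teacher's relative preferences among wrong answers, which is where much of the transferred knowledge lies; the `T^2` factor keeps gradient magnitudes comparable across temperatures.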
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)